PROCEEDINGS 

OF SCIENCE 



Comparison of Chiral Fermion Methods 



in 

Robert G. Edwards* 1 

. Jefferson Lab, 12000 Jefferson Ave, Newport News, VA 23606, USA. E-mail: 



edwards@ j lab .org 



o 

Q ; Balint Job* 

UKQCD Collaboration 

| School of Physics, University of Edinburgh, King's Buildings 

Edinburgh EH9 3JZ, UK 
, E-mail: |bjoo@jlab. org] 

^ ; Anthony D. Kennedy 

School of Physics, University of Edinburgh, King's Buildings 
O ! Edinburgh EH9 3JZ, UK 

1^ | E-mail: |adk@ph .ed.ac.uk 

Kostas Orginos 

Jefferson Lab and Department of Physics, College of William and Mary 
P.O. Box 8795, Williamsburg, VA 23187-8795, USA 
E-mail: [kostas@jlab.org 



<D 

Urs WengeH* 

John von Neumann Institutfiir Computing NIC 
' Platanenallee 6, D-15738 Zeuthen, Germany 



E-mail: urs.wenger@desy.de 



We present a comparison of various five-dimensional representations of chiral fermions consider- 
ing their cost and residual chiral symmetry breaking. 



XXIIIrd International Symposium on Lattice Field Theory 

25-30 July 2005 

Trinity College, Dublin, Ireland 



* Speaker. 

"'"Submitted to PoS on 12th October 2005. 

* Address since September 16, 2005: Jefferson Lab, 12000 Jefferson Ave, Newport News, VA 23606, USA 
^Address since 1st October 2005: Institute for Theoretical Physics, ETH Zurich, CH-8093 Zurich, Switzerland. 



© Copyright owned by the author(s) under the terms of the Creative Commons Attribution-NonCommercial-ShareAlike Licence. 



http: //pos . sissa . it / 



Comparison of Chi ml Fermion Methods 



Robert G. Edwards 



1. Introduction 

The realization of chiral symmetry on the lattice represents a major breakthrough in the field 
of nonperturbative studies of QCD - in particular it allows the numerical simulation of QCD on the 
lattice with realistically light fermion flavors. The key to this success is the overlap Dirac operator 
which is related to the domain wall formalism in five dimensions [[I], ^ Qj. 

This paper provides a comprehensive discussion of various 5D versions of the overlap Dirac 
operator where the matrix sign function is approximated by a rational function. We show explicitly 
how the different variants emerge from the different representations of the rational function and 
how they all reduce to the same 4D effective lattice Dirac operator which is the usual overlap 
operator or its Hermitian version. In particular we show that the standard domain wall formulation 
is simply just one specific member of a large class of 5D operators. 

An important goal of the numerical work is to evaluate the "cost" of various chiral fermion 
actions. We only consider 5D operators - the ultimate aim is their use in the force term in Hybrid 
Monte Carlo simulations. 

We will consider 5D formulations to the 4D Overlap Operator or its Hermitian form: 

D 4 (m) = -[(l+m) + (l-m)y 5 sgn(#)] (1.1) 

Df(m) = -^—Y 5 D 4 (m)=RY 5 + sga(H), R = (1.2) 
1 — m l—m 

where sgn(//) is approximated by a rational function e(H). There is a four dimensional space of 
algorithms we can exploit: 

• choice of kernel H, 

• representation (form) ofe(H). We consider partial fraction, continued fraction and Euclidean 
Cayley Transform representations, 

• approximation of sgn(//) ps £ n ,m(H) = q(h) ^ or some polynomials P n and Q m . This is usu- 
ally the set of coefficients that fix the particular approximation in the chosen representation, 
and 

• constraints - how are fermions introduced into the path integral. 

Let us consider the various representations of a rational function. The Partial Fraction form [Q, 



uses 



£2n-l,2n( x ) = X \ PO + Y. 7,' (I- 3 ) 



The Continued Fraction form [|| ^, |Kj uses 



£2n- 1 ,2n (x) = j5 X H l — (1.4) 

p lX + — 



1 

...+ 
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Figure 1: Maximal error of the Zolotarev approximation over the region [e, 1] as a function of e and L s . 



The Euclidean Cayley Transform form connects the overlap form to the domain wall form []7|, [TT|, 



12, 13, 14]. It uses 



£2n-\,2n{ x ) 



T-\x)-l 



2n 



1 - (D;X 



+ G>iX 



(1.5) 



r-i(x) + . 

T is the transfer matrix in the fifth dimension, and the ft), come from the specific approximation. 

There are various approximation functions for sgn(H). These approximations give coeffi- 
cients for all the representations mentioned above. One can also supplement the approximation 
with eigenvectors of the H operator to decrease errors [^] ; projection can be used in all represen- 
tations [|12|]. The original Higham-Tanh approximation [15] (a.k.a. Neuberger's Polar form [|5|]) 
is 

{l+x) 2n -{l-x) 2n 



£2n-\,2n(x) = tanh —2nln 



1 



1 +x 



tanh (In tanh l (xj) 



(\+x) 2n + (\-x) 2n ' 

We note that e 2n -i,2n{x) = e 2n -i,2n(l/*). 

Over a desired approximation region (x G [e, 1]), the approximation due to Zolotarev 



(1.6) 



with 



£2n,2n- 1 (*) — xR n n (x ) , R n n (x 



C21 = — K sn (2iKl jn , k) C21- 1 



I^i^ + ca-i] 



2/ — 1 
-K sn iK , K 



(1.7) 



(1-8) 



has much reduced errors for a fixed number of poles compared to the "tanh" approximation. Here, 
K = Vl — £ 2 and K = K(k) is the complete elliptic integral. The C0j are the locations in x where 

£2n-l,2n(*) = 1- 

We plot in Figure [I] the maximal error of the Zolotarev approximation over the interval [e, 1] 
as a function of L s for various values of £. The maximal error of the tanh approximation over the 
same interval is very close to 1 - the maximal error occurs at x = e. 
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The kernel H - the auxiliary Dirac operator - is what determines the "physics" of the problem. 
For example, this is where the discretization errors enter. There are an infinite number of choices 
here, but some common variants are as follows. The overlap-like version is H = H w = y$D w {—M) 
where D w (—M) is the Wilson-Dirac operator with supercritical mass 1 < M < 2. In the Shamir 
domain wall version, the induced operator is H = Hj = H w (2 + a$D w ) ~~ 1 . The Mobius version [ 14 ] 
H — (t>5 + cs)H w [2 + (&5 — cs)D n ]~ l interpolates between H w and Hj. 

One could use smeared links within the operator to improve the spectrum (see e.g. [16]). 
We require H to be local, and differentiable (for dynamical fermion calculations). Ideally, the 
approximation should cover the spectrum of H. Different choices of H lead to different spectra and 
different O(o , )-discretization effects in H and therefore to different 0(a 2 ) effects in the resulting 
D4 operator. 



2. Introducing the Schur Decomposition 

The framework we shall use for 5D chiral fermion operators is based on Schur decomposition. 
Using a 5D form gives the obvious benefit of inversion of 4D chiral operator within a single (5D) 
Krylov space. Furthermore, this framework allows for a formal reduction of the 5D theory to 
the 4D theory. It also allows for a formal derivation of the determinant of the 5D operator for 
5D dynamical fermion calculations. Even-odd preconditioning can also be accommodated in this 
framework. 

Consider the 5D block matrix 

with D a 4D sub-matrix. The Schur decomposition of M5 is: 

M 5 =LDU=[ " " I I ~ ) where 5 = D-CA~ 1 B (2.2) 




is referred to as the Schur Complement. The essence of the 5D framework is that D^{m) (or D^{m)) 
is the 4D Schur Complement of M$ (or of a unitary transformation of M5). The use of the Schur 
Complement in 5D formulations was advocated in ^ in connection with multigrid techniques. 

The Schur decomposition automatically provides inversion of the 4D system through inversion 
of the 5D system. We have 






0Sj{ y 4 ) = 
and it is clear, that solving the 5D system solves the 4D system Sy/4 = %a- 




(2.3) 



Also referred to as the domain wall height 
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The Schur decomposition also provides the determinant and the effective 4D operator. By 
inspection of eq. (|^), we see that det(L) = det(f7) = 1=> det(M 5 ) = det(A)det(5). As a conse- 
quence, we need an operator that will cancel bulk 5D modes in M5 (i.e., det(A)). This operator is 
often called the Pauli-Villars operator which we define here as follows: 



M 



PV 



LDp V U, D 



PV 



A 
1 



det(Mpy) = det(Dpy) = det(A) 



(2.4) 



Formally at least, M$M P y is an effective 4D operator: 

' \ ; /io\ : ,/o 



M5M. 



PV 



Y4 



¥4 



Sy 4 



(2.5) 



We can now connect the Schur complement formalism to the 5D action and path integral (the 
constraints). Requiring the existence of A -1 and the positivity of S only, we have for the partition 
function: 

Z = J dUdydy exp{-\fr(M 5 M P y) V} = J dUdydy exp \ -y/L ( Q ^ J L _ V \ ( 2 - 6 ) 



dUdxdx exp J -% f J ° J x 



oc j dUdXAdXA exp{-^5^ 4 } . 



defining % = yL,x = L V 



(2.7) 
(2.8) 



The path integral needs MpyM^ 1 , but in general Mpy involves A -1 (in L) which can be a dense 
operator. However, one can always construct Dpy which needs only A The resulting partition 
function is 

Z = j dUd(j)U(j) drfdr] expj-^M^ -^Dpyrj} = J rft/det(A)det(S)/det(A) = J dUdet(S) 

(2.9) 

In Hybrid Monte Carlo, one uses M^M 5 , D PV D PV to ensure integrals converge. Single flavor theo- 
ries can be simulated with RHMC and taking the square and fourth roots of the squared operators. 



3. Matrix Representations 

We now turn to the specific 5D representations used in this work. For purposes of illustration, 
we will consider unpreconditioned systems with short 5D extents. 

Partial Fraction: Let us begin with the partial fraction operator of ref. JFT]]. We show below 
partial fraction expansion with only two poles. The operator is 



Ms 



H 










' 




-H 
















H 


-y/qi 













-H 







y/P2 







R75+P0H 



\ 



( 



B 



(3.1) 
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and the Schur Complement is just the pole approximation to D^(m): 



S = R Y5+ p H + 



Pi 



+ 



Pi 



H + q\H~ l H + q 2 H- 1 
and all the results of the section |2| follow. 



D% (m) (3.2) 



Continued Fraction: For the continued fraction representation, we illustrate the matrix in the 
2n + 1 = 3 case: 

/ PiH 1 \ 

M 5 = 1 —f$\H 1 (3.3) 

\ 1 Rys + PoHj 
We apply the Schur decomposition recursively. In order to form S we need to invert it using its 
Schur complement Si and so forth. Finally we arrive at 



(3.4) 



where 





( 1 







(s 2 







( 1 S 2 1 \ 


M 5 = 




1 


!) 





Si 


3 


o i sr 1 




\ 


sr 1 




^° 







[o 1 ) 



S 2 = PiH, S x =-$ l H-S 2 - i 

S = RYs+PoH-Si 1 =Rr 5 +PoH + (frH + (p 2 H)- 1 )- 1 
w (w) 

and the outermost Schur complement is S = and all the previous results apply. 



(3.5) 
(3.6) 
(3.7) 



Domain Wall: The domain wall formalism can be written in a Cayley transform Schur comple- 
ment system |TTJ, 12, 13, 14]. To simplify the formalism, we apply a unitary transformation 
(defined in Jl2[]) on the domain wall kernel to bring each chiral half of the physical field onto the 
same wall. We find that (using a length of In = 4 in the 5th dimension for illustration): 
















1 


_T-1 
1 3 











1 


-T 4 - l c+ 


-Tf 1 








C- 



\ 



where 



1 - (OjH 



c± 



1 — m 1 +m 



(3.8) 



(3.9) 



The Schur complement is: 



c_+rr 1 r,- 1 2r 1 r, 



<T- l + l)Y5 



2 ± 3 ± A c + , define T 
l+m 1—m r _1 



1 



rf 1 r 2 - 1 r 3 - 1 r 4 - 1 

= M 5 (m= l)Z) 4 (m) 



(3.10) 



2 ' r 1 + i 

We find that the domain wall system is a little different from the others. We see that det(A) = 1 
and 5 = M${m = \)D^{m). On the other hand, one can define Mpy = Ms(m= 1) which simplifies 
its use and provides a local 5D operator. This result implies MpyMs = D 4 {m) is an effective 4D 
operator. Remaining results then follow from the Schur complement machinery. 
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V=lfr 32, Ls=12,M=1.8, Shamir m si ^0.02, 15 Configs, matched Borici m v =0.01 15 

« 5D m correlator (Moebius DWF, H_w, Higham Appioxl 
□ 4D m correlator (A L /Pseudo), (H_w, Higham Approx, 4D pan. frac.) 



0.0O0I((J 



4D m ra fit: 

bare rrT _ = 0.000110(-12,13) 
X 2 id.o.f = 1.33 



5D m res fit: 

bare m^ = 0.0001 14(-13.+ 14) 
X J ./d.o.f = 1.27 



ft 



2 4 6 8 10 12 14 16 18 20 22 24 26 28 30 



Figure 2: Comparison of the determination of m res using either a fit in 4D or 5D to m res (t). 



Preconditioning and Implementation Tricks: Suppose our operator H is a ratio of two opera- 
tors, H = HnHq 1 (e.g., the kernel is Hj or //Moebius-) In this case, we naively have an inversion 
in our 5D operator, resulting in a nested inversion. We overcome this, by rationalizing the denom- 
inator and solving M^HdHj^^i = %. We proceed in two steps, first solving (MsHo)y' = then 
Y = Ho\j/. After collecting terms in M5//0, we have just one extra H application to reconstruct y. 

All three formulations allow continuous equivalence transformations on the approximation 
coefficients which may be used to make the operator better conditioned. 

Even-odd preconditioning follows directly from the Schur decomposition framework devel- 
oped so far. The main consideration is the even-odd block decomposition of the 5D operator 2 . 
If 

M 5 = ( M ~ Me ° ) = ( 1 _, ° H M « °_, li; ~" "~ I (3.11) 




^M oe M 00 J yM oe M ee V V M oo-M ue M ee M<- 

and if M ee preserves the structure of M5, then the Schur decomposition of M5 suggests the efficient 
application of Mj e : one should Schur decompose M ee . 

For the representations presented here, the efficient use of even-odd preconditioning depends 
on H. For general H = //Moebius, we see that M^ 1 does not depend on the gauge fields and can be 
efficiently applied to a vector. 



4. Numerical Results 

4.1 Which 5D Fermion Action? 

We want to evaluate the "cost" of various chiral fermion actions. Here, we only consider 5D 
inversions for use in the force term in HMC and no eigenvector projection is used. We thus have 
a residual mass. We choose a winner using a cost metric, which is the number of Wilson-Dirac 
applications for fixed residual mass. Using the denominator rationalization trick mentioned earlier 
we find that if a general Mobius DWF operator requires 2n applications of D w , the corresponding 

2 This is also a Schur Decomposition 
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Figure 3: Comparison of the pion correlator and effective mass for various actions, demonstrating how 
different variants of 5D operators reduce to the same effective 4D operator. The different kernels H w and 
Ht produce different pion correlators, but the pion mass is tuned to the same value. The difference in mass 
coming from the difference between the tanh and Zolotarev coefficients is too small to show up on these 
plots. 




0.01 
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Figure 4: In the upper panel is shown the residual mass per configuration for various 5D actions using H = 
H w . Here, "Zolo" is the Zolotarev version of Continued Fraction, "Tanh" is the tanh version of Continued 
Fraction, "NEF" is the tanh domain-wall like variant, and "Zolo NEF" is the Zolotarev domain-wall like 
variant. In the lower panel is the ratio of extremal eigenvalues for H w per configuration. 



Continued or Partial fraction operators require 2n + l such applications. In the case of H = H w , all 
formulations require only 2n applications of D w (since j8o = in the continued fraction and po = 
in the partial fraction case). 

4.2 Residual chiral symmetry breaking 

For our cost comparisons, we note that the 5D definition of the residual mass has been related 
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to the 4D effective operator [|14|, [TSp : 

ZdQxYsQx qoYsqo) Lt Tr ( D l ( m ) )x,o \y D 4 1 ( m )yfi 

( 1 - m) 2 {Z x qxYsqx qoYsqo) ~ £ A . Tr G^G^o 

Here, G = y^— (D4 (m) — 1) is the usual subtracted propagator of the 4D overlap formalism, and 
A is the breaking term in the Ginsparg-Wilson relation from the approximation where e / sgn: 

27 5 A=7 5 D4(0)+D4(0)7 5 -2D4(0)7 5 D 4 (0)^A = i(l-£ 2 (//)) . (4.2) 

where we have used D 4 (0) = ±(1 + y$e(H)). We see that m res ^ is purely a deficiency of the 



approximation s(H) m sgn(H). Further, the numerator in eq. (4.1) can easily be computed in 4D 
using the pole approximation of s(H). Thus, we can compare chiral breaking for all formulations 
on an equal footing. 

Summation is carried out over both spatial and temporal coordinates in (|4~l|). If one does not 
sum over time, the resulting time-sliced correlation function is not identically equal to the usual 
Domain Wall m res (t). However, since A xv is a local operator, at low energies the two are very 
similar as can be seen by observing fig. ||. Further, at the cost of introducing a small contact term 
on the source time-slice and a term that vanishes in the ensemble average one may replace the D 1 
terms in eq. ( ft.l| ) by the subtracted propagators G. In our computations, this simplification was 
used, however, apart from figure ||] we always carried out the temporal sum. 

4.3 Setup 

For our numerical tests, we used 15 Nj ■ = 2 domain- wall fermion configurations (provided by 



RBC [ 19]) with L s = 12 and am q = 0.02 corresponding to m n = 500MeV. We compare the cost of 
inversions for even-odd preconditioned Mobius variants of 5D Partial Fraction, Continued Fraction, 
and Domain Wall operators. We use only a CGNE inverter. Unfortunately, BiCG-stab and MR are 
not convergent. In all tests, we used a target residual of 10~ 7 and M = 1.8 in D w . We tuned the 
pion mass of our 5D operators using H = H w to match the domain-wall pion mass which resulted 
in an am q = 0.0115. 

We computed the cost (number of Wilson-Dirac applications) for multiple L s with the tanh 
and Zolotarev approximations. For the latter, we found the largest eigenvalue of Hj and H w to be 
about 1.6 and 5.8, respectively. We set the upper bound of the approximation to this X max , and set 
the lower bound to & fixed e x X max where e = 0.01 in practice. 

4.4 Equivalence of representations/variants 

In Figure [|, we show the pion correlator for different representations and approximations at 
the tuned pion mass. We see that the deviations from each other are small and under control. 

4.5 m res per configuration 

In Figure || we show a plot of m res per configuration for various actions using H = H w . We 
see that the "tanh" versions of domain-wall and continued fraction agree as expected. Also, the 
Zolotarev versions of domain-wall and continued fraction agree for the first configuration. We did 
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Cost vs Mres 

Dyn. DWF (Ls=12), m=0.020 




I m I (m„. . /m) 

res Shamir 

Figure 5: Cost in Wilson-Dirac applications vs. m les for the most promising 5D actions. 

not measure the Zolotarev version of the domain wall for the remaining configurations as the cost 
appeared too expensive, however, we expect similar agreement we see for the first configuration. 
We also see that in the "tanh" approximation cases the m res is notably higher than in the Zolotarev 
cases. This is consistent with the much larger approximation errors e (x) in polar form compared to 
Zolotarev. 

The notable exception to a small m res is configuration 10 where the lowest eigenvalue of H w is 
quite small. We find that fixing the approximation range is preferable to allowing it to vary. What 
we are seeing is an interplay of the density of eigenvalues of H w and the approximation accuracy. 
The eigenvectors of H w are also eigenvectors of A which can be written as / dXp{X)^{X) = 
(1/4) J dkp{X){\ — e(A) ). If the density of small modes is low, it is not worth increasing the 
approximation range to accommodate the low modes. Increasing the approximation range increases 
the maximum error which thus magnifies A for all modes - not just the smallest modes. In our tests 
we therefore fixed the maximal range of the approximation. We note that is no worse than a "tanh" 
approximation where the approximation is fixed for a given order of the rational approximation. 

4.6 Cost versus m res 

In our tests we compare the cost of inverting the even-odd preconditioned 5D operators versus 
m res . We note that this preconditioning gives between a factor of 1.5-3 speedup over the unprecon- 
ditioned systems. The cost of applying M^ e l is negligble in flops; however, some care is needed in 
writing optimized codes due to memory loads in M ee . The goal is small m res for small cost. 

In Figure [5] we plot the cost versus m res for the most promising actions. Many of the Zolotarev 
variants were very expensive and not shown. We remark, however, that we did not explore possible 
further tuning of the condition of our operators via equivalence transformations on our coefficients, 
which may reduce the cost of these expensive variants. Having said this, of all the actions we 
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Cost vs A 

Dyn. DWF (Ls=12), m =0.020 




CFZ: L =8 



'sliamir (a-1,' Hj,, Higham) ' ^ 
Continued Fraction (H , Zolotarev) 
Continued Fraction (H Zolotarev) 



DWF: L =12 



lc-06 



le-05 



0.0001 



0.001 



Figure 6: Cost in Wilson-Dirac applications vs. A 2 for some of the 5D actions in Figure ^| 



tested, the standard domain-wall (Hj) is the least effective. The Zolotarev Continued Fraction 
(H w and Ht) are the top candidates. The Zolotarev Partial Fraction version is promising. We 
note that rescaling the coefficients b$ + C5 in flivioebius Wlt h b$ — C5 = 1 is equivalent to a rescaling 
a = b$ +C5 in e{aHj). For the tanh approximation, we can slide the Hj down inside e{Hj) to 
lower approximation errors. We see this rescaling results in a corresponding decrease in m res at no 
cost increase. 

In Figure || we plot the cost versus the second moment A 2 . We see no large shift in cost 
corresponding to wild oscillations that cancel in m res but not in A 2 . 



5. Summary and conclusions 

We have presented a unified framework for 5D chiral operators. The crux of the framework 
is the Schur decomposition and Schur complement techniques. The framework is agnostic about 
representation, approximation or the kernel. The framework accommodates (and can help with) 
even-odd preconditioning. In practice, the computational cost of a 5D formulation depends upon 
the spectral properties of M5 and upon m res . While m res is an artifact of the approximation e{H), 
it depends not only on how good the approximation is, but also how much of the spectrum of H 
is not well approximated (e.g., the near-zero modes of H). The condition of M5 depends upon 
its matrix structure and coefficients, e.g., the approximation and representation and potentially on 
further tuning of the condition through equivalence transformations and permutation symmetry. 

We have presented a detailed comparison of the "cost" of various chiral fermion actions, where 
cost is the number of Wilson-Dirac applications for fixed m res in a conjugate gradient solver. We 
have found the standard domain-wall action is the least effective. The Zolotarev versions of Con- 
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tinued Fraction (with H = H w and Hj) appear to be among the best. Future work will focus on 
dynamical fermion implementations of this operator. 
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